The Foundations of Thread-level Parallelism in the SuperMatrix Runtime System

Author

  • Ernie Chan
Abstract

In this paper, we describe the interface and implementation of the SuperMatrix runtime system. SuperMatrix exploits parallelism from matrix computations by mapping a linear algebra algorithm to a directed acyclic graph (DAG). We give detailed descriptions of how to dynamically construct a DAG where tasks consisting of matrix operations represent the nodes and data dependencies between tasks represent the edges of the graph. We show the algorithm that, given a DAG as input, dispatches and schedules tasks to threads. Different scheduling heuristics and optimizations implemented as part of SuperMatrix are discussed, demonstrating the flexibility and portability that result from the separation of concerns. Using this flexible framework, we compare several scheduling algorithms, such as work stealing, that optimize for either load balancing or data locality, and we demonstrate that a relatively simple, single-queue implementation provides exceptional performance while also allowing for the widest flexibility for further enhancements. Performance results from a sixteen-core machine are provided.
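
The two steps named in the abstract, building a DAG from data dependencies and dispatching its tasks to threads from a single queue, can be illustrated with a minimal Python sketch. This is not the SuperMatrix interface or its implementation: the `Task` class, the `build_dag` and `run` functions, the block names, and the four-task example are all hypothetical, and each task is assumed to declare the matrix blocks it reads and writes.

```python
import queue
import threading


class Task:
    """Hypothetical task record: a name plus the blocks it reads and writes."""

    def __init__(self, name, reads, writes):
        self.name = name
        self.reads = set(reads)
        self.writes = set(writes)
        self.successors = []  # tasks that must wait for this one
        self.pending = 0      # number of unfinished predecessors


def build_dag(tasks):
    """Add an edge earlier -> later whenever two tasks conflict on a block."""
    for j, later in enumerate(tasks):
        for earlier in tasks[:j]:
            flow = earlier.writes & later.reads      # read after write
            anti = earlier.reads & later.writes      # write after read
            output = earlier.writes & later.writes   # write after write
            if flow or anti or output:
                earlier.successors.append(later)
                later.pending += 1
    return tasks


def run(tasks, num_threads=4):
    """Dispatch tasks from one shared ready queue; return completion order."""
    if not tasks:
        return []
    ready = queue.Queue()
    lock = threading.Lock()
    order = []
    for t in tasks:
        if t.pending == 0:
            ready.put(t)

    def worker():
        while True:
            task = ready.get()
            if task is None:  # sentinel: all work is finished
                return
            # A real runtime would execute the task's matrix kernel here.
            with lock:
                order.append(task.name)
                for s in task.successors:
                    s.pending -= 1
                    if s.pending == 0:
                        ready.put(s)
                if len(order) == len(tasks):
                    for _ in range(num_threads):
                        ready.put(None)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return order


# Illustrative diamond: B and C depend on A, and D depends on both B and C.
tasks = build_dag([
    Task("A", reads=[], writes=["x"]),
    Task("B", reads=["x"], writes=["y"]),
    Task("C", reads=["x"], writes=["z"]),
    Task("D", reads=["y", "z"], writes=["w"]),
])
order = run(tasks)
print(order)  # A is always first and D always last; B and C may swap
```

In this sketch the graph is built once from the declared read and write sets, and the scheduler only ever looks at the ready queue, which is one way to picture the separation of concerns the abstract mentions. Swapping the single queue for per-thread queues would change `run` without touching `build_dag`.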

Similar resources

Efficient Runtime Thread Management for the Nano-Threads Programming Model

The nano-threads programming model was proposed to effectively integrate multiprogramming on shared-memory multiprocessors, with the exploitation of fine-grain parallelism from standard applications. A prerequisite for the applicability of the nano-threads programming model is the ability of the runtime environment to manage parallelism at any level of granularity with minimal overheads. In thi...

Exploiting fine-grain thread parallelism on multicore architectures

In this work we present a runtime threading system which provides an efficient substrate for fine-grain parallelism, suitable for deployment in multicore platforms. Its architecture encompasses a number of optimizations that make it particularly effective in managing a large number of threads and with low overheads. The runtime system has been integrated into an OpenMP implementation to allow f...

Compiling Data-parallel Programs to a Distributed Runtime Environment with Thread Isomigration

Traditionally, the compilation of data-parallel languages is targeted to low-level runtime environments: abstract processors are mapped onto static system processes, which directly address the low-level IPC library. Alternatively, we propose to map each HPF abstract processor onto a “lightweight process” (thread) which can be freely migrated between nodes together with the data it manages, unde...

An Algorithm-by-Blocks for SuperMatrix Band Cholesky Factorization

We pursue the scalable parallel implementation of the factorization of band matrices with medium to large bandwidth targeting SMP and multi-core architectures. Our approach decomposes the computation into a large number of fine-grained operations exposing a higher degree of parallelism. The SuperMatrix run-time system allows an out-of-order scheduling of operations that is transparent to the pr...

Runtime Support for Multigrain and Multiparadigm Parallelism

This paper presents a general methodology for implementing on clusters the runtime support for a two-level dependence-driven thread model, initially targeted to shared-memory multiprocessors. The general idea is to exploit existing programming solutions for these architectures, like Software DSM (SWDSM) and Message Passing Interface. The management of the internal runtime system structures and...



Publication date: 2009